Contractivity of Bellman operator in risk averse dynamic programming with infinite horizon
Abstract
The paper deals with a risk-averse dynamic programming problem with infinite horizon. First, the assumptions required for the problem to be well defined are formulated. Then the Bellman equation is derived, which may also be seen as a standalone reinforcement learning problem. The fact that the Bellman operator is a contraction is proved, guaranteeing convergence of various solution algorithms used for dynamic programming problems, which we demonstrate on the value iteration and policy iteration algorithms.
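As a rough illustration of the contraction claim, consider the following sketch; the operator form and the one-step risk mapping are assumed for illustration and may differ from the paper's exact setting. With a monotone, translation-invariant one-step risk mapping $\rho$ and a discount factor $\gamma \in (0,1)$:

```latex
% A minimal sketch of a risk-averse Bellman operator; notation assumed,
% not taken verbatim from the paper.
\[
  (Tv)(s) \;=\; \min_{a \in A(s)} \Big\{\, c(s,a)
      + \gamma\, \rho\big( v(s') \mid s, a \big) \Big\} .
\]
% Monotonicity and translation invariance of \rho yield, for bounded v, w,
\[
  \| Tv - Tw \|_\infty \;\le\; \gamma\, \| v - w \|_\infty ,
\]
% so T is a gamma-contraction and value iteration v_{k+1} = T v_k
% converges geometrically to the unique fixed point of T.
```

This contraction is what makes a value-iteration loop converge. Below is a minimal Python sketch using CVaR as a hypothetical coherent one-step risk measure; the choice of CVaR, the parameter names, and the default values are all illustrative, not taken from the paper:

```python
import numpy as np

def cvar(values, probs, alpha=0.9):
    """CVaR_alpha of a discrete cost distribution (higher cost = worse).

    CVaR is one standard coherent one-step risk measure; using it here,
    and the value of alpha, are illustrative assumptions.
    """
    order = np.argsort(values)[::-1]              # worst (highest) costs first
    v, p = values[order], probs[order]
    tail = 1.0 - alpha                            # probability mass of the tail
    cum_before = np.concatenate(([0.0], np.cumsum(p)[:-1]))
    w = np.minimum(p, np.maximum(tail - cum_before, 0.0))
    return float(np.dot(w, v) / tail)

def value_iteration(P, c, gamma=0.95, alpha=0.9, tol=1e-8):
    """Risk-averse value iteration: v <- min_a [ c(s,a) + gamma * CVaR(v) ].

    P: transition probabilities, shape (S, A, S); c: one-step costs, (S, A).
    Geometric convergence follows from the gamma-contraction of the
    Bellman operator sketched above.
    """
    n_states, n_actions, _ = P.shape
    v = np.zeros(n_states)
    while True:
        q = np.array([[c[s, a] + gamma * cvar(v, P[s, a], alpha)
                       for a in range(n_actions)]
                      for s in range(n_states)])
        v_new = q.min(axis=1)
        if np.max(np.abs(v_new - v)) < tol:
            return v_new, q.argmin(axis=1)        # value function, greedy policy
        v = v_new
```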
Similar resources
Risk-Averse Approximate Dynamic Programming with Quantile-Based Risk Measures
In this paper, we consider a finite-horizon Markov decision process (MDP) for which the objective at each stage is to minimize a quantile-based risk measure (QBRM) of the sequence of future costs; we call the overall objective a dynamic quantile-based risk measure (DQBRM). In particular, we consider optimizing dynamic risk measures where the one-step risk measures are QBRMs, a class of risk mea...
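For context, a hedged sketch of how a dynamic risk measure is typically composed from one-step risk measures; the notation is illustrative and the cited paper's QBRM construction may differ:

```latex
\[
  \rho_{t,T}(c_t, \dots, c_T) \;=\;
    c_t + \rho_t\Big( c_{t+1} + \rho_{t+1}\big( c_{t+2} + \cdots
        + \rho_{T-1}(c_T) \big) \Big),
\]
% where each one-step mapping rho_t is a quantile-based risk measure,
% e.g. a value-at-risk or CVaR of the conditional cost distribution.
```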
Dynamic linear programming games with risk-averse players
Motivated by situations in which independent agents, or players, wish to cooperate in some uncertain endeavor over time, we study dynamic linear programming games, which generalize classical linear production games to multi-period settings under uncertainty. We specifically consider that players may have risk-averse attitudes towards uncertainty, and model this risk aversion using coherent cond...
Risk neutral and risk averse Stochastic Dual Dynamic Programming method
In this paper we discuss risk neutral and risk averse approaches to multistage (linear) stochastic programming problems based on the Stochastic Dual Dynamic Programming (SDDP) method. We give a general description of the algorithm and present computational studies related to planning of the Brazilian interconnected power system.
Interchangeability principle and dynamic equations in risk averse stochastic programming
In this paper we consider interchangeability of the minimization operator with monotone risk functionals. In particular we discuss the role of strict monotonicity of the risk functionals. We also discuss implications to solutions of dynamic programming equations of risk averse multistage stochastic programming problems.
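A hedged sketch of the interchangeability principle under discussion, with illustrative notation not taken from the cited paper: for a monotone risk functional $\rho$ and a random cost $f(x,\omega)$, minimization over decision rules $\bar{x}(\cdot)$ can be interchanged with $\rho$,

```latex
\[
  \inf_{\bar{x}(\cdot)} \rho\big( f(\bar{x}(\omega), \omega) \big)
  \;=\;
  \rho\Big( \inf_{x \in X} f(x, \omega) \Big),
\]
% provided the pointwise infimum is attained; strict monotonicity of rho
% relates to whether pointwise minimizers yield optimal decision rules.
```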
Stabilizing Policy Improvement for Large-Scale Infinite-Horizon Dynamic Programming
Today’s focus on sustainability within industry presents a modeling challenge that may be dealt with using dynamic programming over an infinite time horizon. However, the curse of dimensionality often results in a large number of states in these models. These large-scale models require numerically stable solution methods. The best method for infinite-horizon dynamic programming depends on both ...
Journal
Title: Operations Research Letters
Year: 2023
ISSN: 0167-6377, 1872-7468
DOI: https://doi.org/10.1016/j.orl.2023.01.008